Modeling of between-speaker and within-speaker variation in spontaneous speech tempo
Author
Abstract
Speech tempo (speaking rate) varies both between and within speakers. Previous research suggests several relevant factors and predictors. The present study investigates all these factors combined, both between and within speakers, in a large corpus of spoken Dutch interviews. This is done by means of multi-level modeling of sex, age, and dialect region (all between speakers) and of phrase length and the sequential position of the phrase within the session (both within speakers). Results show that speech tempo depends mainly on phrase length, and not on the between-speaker factors sex, age, or dialect region. Within-speaker tempo variations exceed the just noticeable difference (JND). Separate modeling of phrase length itself reveals significant negative effects of age and of sequential position, but not of region or sex. Taken together, these results underline the phonetic and communicative importance of within-speaker variations in speech tempo.
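As a rough illustration of the kind of multi-level model the abstract describes, a two-level random-intercept specification with phrases nested within speakers could be written as below. This is only a sketch under stated assumptions: the subscripts, coefficient names, and the random-intercept-only structure are hypothetical, and the abstract does not give the study's actual specification (for example, any random slopes or transformations, or how dialect region is coded).

```latex
% Hypothetical sketch: phrase i nested within speaker j.
% Region is categorical, so \beta_5 stands in for a set of dummy coefficients.
\begin{align*}
\text{tempo}_{ij} &= \beta_0
  + \beta_1\,\text{length}_{ij}
  + \beta_2\,\text{position}_{ij}
  && \text{(within-speaker predictors)} \\
 &\quad + \beta_3\,\text{sex}_{j}
  + \beta_4\,\text{age}_{j}
  + \beta_5\,\text{region}_{j}
  && \text{(between-speaker predictors)} \\
 &\quad + u_{j} + e_{ij},
  \qquad u_{j} \sim \mathcal{N}(0, \sigma_u^2),
  \quad e_{ij} \sim \mathcal{N}(0, \sigma_e^2)
\end{align*}
```

In a formulation like this, the between-speaker variance \sigma_u^2 and the within-speaker variance \sigma_e^2 are estimated separately, which is what allows the two sources of tempo variation to be compared.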
Similar resources
Multilevel modeling of between-speaker and within-speaker variation in spontaneous speech tempo.
Speech tempo (articulation rate) varies both between and within speakers. The present study investigates several factors affecting tempo in a corpus of spoken Dutch, consisting of interviews with 160 high-school teachers. Speech tempo was observed for each phrase separately, and analyzed by means of multilevel modeling of the speaker's sex, age, country, and dialect region (between speakers) an...
Convolutional neural network with adaptive windows for speech recognition
Although speech recognition systems are widely used and their accuracy continues to improve, there is a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Speaker independent acoustic modeling using speaker normalization
This paper proposes a novel speaker-independent (SI) modeling approach for spontaneous speech data from multiple speakers. The SI acoustic model parameters are estimated by individual training for inter-speaker variability and for intra-speaker phonetically related variation in order to obtain a more accurate acoustic model. The linear transformation technique is used for the speaker normalization to ext...
Rhythmic variability between speakers: articulatory, prosodic, and linguistic factors.
Between-speaker variability of acoustically measurable speech rhythm [%V, ΔV(ln), ΔC(ln), and Δpeak(ln)] was investigated when within-speaker variability of (a) articulation rate and (b) linguistic structural characteristics was introduced. To study (a), 12 speakers of Standard German read seven lexically identical sentences under five different intended tempo conditions (very slow, slow, norma...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...